Automatic Linguistic Indexing of Pictures by a Statistical Modeling Approach
Authors
Abstract
Automatic linguistic indexing of pictures is an important but highly challenging problem for researchers in computer vision and content-based image retrieval. In this paper, we introduce a statistical modeling approach to this problem. Categorized images are used to train a dictionary of hundreds of statistical models, each representing a concept. Images of any given concept are regarded as instances of a stochastic process that characterizes the concept. To measure the extent of association between an image and the textual description of a concept, the likelihood of the occurrence of the image based on the characterizing stochastic process is computed. A high likelihood indicates a strong association. In our experimental implementation, we focus on a particular group of stochastic processes, that is, the two-dimensional multiresolution hidden Markov models (2-D MHMMs). We implemented and tested our ALIP (Automatic Linguistic Indexing of Pictures) system on a photographic image database of 600 different concepts, each with about 40 training images. The system is evaluated quantitatively using more than 4,600 images outside the training database and compared with a random annotation scheme. Experiments have demonstrated the good accuracy of the system and its high potential in linguistic indexing of photographic images.
Index Terms: Content-based image retrieval, image classification, hidden Markov model, computer vision, statistical learning, wavelets.
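The likelihood-based association described in the abstract can be illustrated with a minimal sketch. It is not the ALIP implementation: a simple independent (diagonal) Gaussian per concept stands in for the 2-D MHMMs, and the function names (`fit_gaussian`, `log_likelihood`, `train_dictionary`, `rank_concepts`), the concept labels, and the toy feature vectors are all hypothetical, chosen only to show the idea of training one model per concept and ranking concepts by the likelihood they assign to a new image.

```python
import math

def fit_gaussian(vectors):
    """Fit an independent (diagonal) Gaussian to a list of feature vectors.

    Assumption: this simple model is a stand-in for the 2-D MHMM
    that the paper trains for each concept.
    """
    n = len(vectors)
    dim = len(vectors[0])
    means = [sum(v[d] for v in vectors) / n for d in range(dim)]
    # A small floor keeps the variance positive when a feature is constant.
    variances = [
        max(sum((v[d] - means[d]) ** 2 for v in vectors) / n, 1e-6)
        for d in range(dim)
    ]
    return means, variances

def log_likelihood(vector, model):
    """Log-likelihood of one feature vector under a fitted concept model."""
    means, variances = model
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        for x, mu, var in zip(vector, means, variances)
    )

def train_dictionary(training_sets):
    """Train one statistical model per concept from categorized examples."""
    return {concept: fit_gaussian(vs) for concept, vs in training_sets.items()}

def rank_concepts(vector, dictionary):
    """Order concepts from strongest to weakest association (highest likelihood first)."""
    scores = {c: log_likelihood(vector, m) for c, m in dictionary.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Toy, made-up 2-D feature vectors; real features would come from the images.
training_sets = {
    "beach": [[0.9, 0.1], [0.8, 0.2], [1.0, 0.15]],
    "forest": [[0.1, 0.9], [0.2, 0.8], [0.15, 1.0]],
}
dictionary = train_dictionary(training_sets)
ranking = rank_concepts([0.85, 0.2], dictionary)
print(ranking)
```

In this sketch the query vector lies close to the "beach" training examples, so that concept receives the higher likelihood and is ranked first, mirroring the statement that a high likelihood indicates a strong association.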
Similar resources
Mining Digital Imagery Data for Automatic Linguistic Indexing of Pictures
In this paper, we present a new research direction, automatic linguistic indexing of pictures, for data mining and machine learning researchers. Automatic linguistic indexing of pictures is an imperative but highly challenging problem. In our on-going research, we introduce a statistical modeling approach to this problem. Computer algorithms have been developed to mine numerical features automa...
ALIP: The Automatic Linguistic Indexing of Pictures System
In this demonstration, we present the Automatic Linguistic Indexing of Pictures (ALIP) system. The system annotates images with linguistic terms, chosen among hundreds of such terms. The system uses a wavelet-based approach for feature extraction, a statistical modeling process for training, and a statistical significance processor to annotate images. We implemented and tested our ALIP system o...
Evaluation strategies for automatic linguistic indexing of pictures
With the rapid technological advances in machine learning and data mining, it is now possible to train computers with hundreds of semantic concepts for the purpose of annotating images automatically using keywords and textual descriptions. We have developed a system, the Automatic Linguistic Indexing of Pictures (ALIP) system, using a 2D multiresolution hidden Markov model. The evaluation of su...
Automatic keyword extraction using Latent Dirichlet Allocation topic modeling: Similarity with golden standard and users' evaluation
Purpose: This study investigates the automatic keyword extraction from the table of contents of Persian e-books in the field of science using LDA topic modeling, evaluating their similarity with golden standard, and users' viewpoints of the model keywords. Methodology: This is a mixed text-mining research in which LDA topic modeling is used to extract keywords from the table of contents of sci...
A two-stage split-and-select model for automatic indexing of Persian texts
Purpose: Each language has its own problems, so appropriate models must be considered for automatic indexing in each language. These models should address the exhaustivity and specificity of indexing. This paper aims to introduce and evaluate a model suited to Persian automatic indexing. This model proposes to break the text into particles of candidate terms and to c...
Journal title:
- IEEE Trans. Pattern Anal. Mach. Intell.
Volume 25
Publication date: 2003